A parallel adaptive P3M code with hierarchical particle reordering
Abstract
We discuss the design and implementation of HYDRA_OMP, a parallel implementation of the Smoothed Particle Hydrodynamics-Adaptive P3M (SPH-AP3M) code HYDRA. The code is designed primarily for conducting cosmological hydrodynamic simulations and is written in Fortran77+OpenMP. A number of optimizations for RISC processors and SMP-NUMA architectures have been implemented. The most important is hierarchical reordering of particles within chaining cells, which greatly improves data locality and thereby removes the cache misses typically associated with linked lists. Parallel scaling is good, with a minimum parallel scaling of 73% achieved on 32 nodes for a variety of modern SMP architectures. We give performance data in terms of the number of particle updates per second, which is a more useful performance metric than raw MFlops. A basic version of the code will be made available to the community in the near future.
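The reordering idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the HYDRA_OMP implementation (which is Fortran77+OpenMP and hierarchical across refinement levels); it is a single-level Python illustration, assuming a cubic box with positions in `[0, box_size)`. The function names `cell_index` and `reorder_by_cell` and the parameters `box_size` and `n_cells` are hypothetical. A counting sort places all particles of one chaining cell in a contiguous slot range, so a loop over a cell walks adjacent memory instead of following linked-list pointers.

```python
def cell_index(pos, box_size, n_cells):
    """Map a 3D position in [0, box_size) to a flattened chaining-cell index."""
    ix, iy, iz = (min(int(c / box_size * n_cells), n_cells - 1) for c in pos)
    return (ix * n_cells + iy) * n_cells + iz


def reorder_by_cell(positions, box_size, n_cells):
    """Return (order, cell_start) so that particles sharing a cell are contiguous.

    order[k] is the original index of the particle stored at slot k.
    Particles of cell c occupy slots cell_start[c] .. cell_start[c + 1] - 1.
    """
    cells = [cell_index(p, box_size, n_cells) for p in positions]
    total_cells = n_cells ** 3

    # Counting sort, step 1: histogram of particles per cell, shifted by one.
    cell_start = [0] * (total_cells + 1)
    for c in cells:
        cell_start[c + 1] += 1
    # Step 2: prefix sum turns counts into starting offsets.
    for c in range(total_cells):
        cell_start[c + 1] += cell_start[c]

    # Step 3: scatter particle indices into their cell's slot range.
    next_slot = cell_start[:-1]  # slicing makes a copy we can advance
    order = [0] * len(positions)
    for i, c in enumerate(cells):
        order[next_slot[c]] = i
        next_slot[c] += 1
    return order, cell_start


# Usage: four particles in a unit box split into 2 x 2 x 2 chaining cells.
positions = [(0.9, 0.9, 0.9), (0.1, 0.1, 0.1), (0.8, 0.9, 0.7), (0.2, 0.1, 0.3)]
order, cell_start = reorder_by_cell(positions, box_size=1.0, n_cells=2)
sorted_positions = [positions[i] for i in order]  # cell-contiguous copy
```

After reordering, the neighbours of a particle in the same cell are found by scanning a slice of `sorted_positions` rather than chasing list links, which is the locality benefit the abstract attributes to reordering. A hierarchical variant would presumably repeat the same step within refined sub-grids, but the abstract does not give those details.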
Similar resources
Tree–Particle–Mesh: an adaptive, efficient, and parallel code for collisionless cosmological simulation
An improved implementation of an N-body code for simulating collisionless cosmological dynamics is presented. TPM (Tree–Particle–Mesh) combines the PM method on large scales with a tree code to handle particle-particle interactions at small separations. After the global PM forces are calculated, spatially distinct regions above a given density contrast are located; the tree code calculates the ...
GOTPM: A Parallel Hybrid Particle-Mesh Treecode
We describe a parallel, cosmological N-body code based on a hybrid scheme using the particle-mesh (PM) and Barnes-Hut (BH) oct-tree algorithm. We call the algorithm GOTPM for Grid-of-Oct-Trees-Particle-Mesh. The code is parallelized using the Message Passing Interface (MPI) library and is optimized to run on Beowulf clusters as well as symmetric multi-processors. The gravitational potential is ...
A Load Balancing Package on Distributed Memory Systems and its Application
We present a tool, Bisect, for balanced decomposition of spatial domains. In addition to applying a nested bisection algorithm to determine the boundaries of each subdomain, Bisect replicates a user-specified zone along the boundaries of the subdomain in order to minimize future interactions between subdomains. Results of running the tool on the Cray T3D system using both shared memory operation...
Parallel Implementation of Particle Swarm Optimization Variants Using Graphics Processing Unit Platform
There are different variants of Particle Swarm Optimization (PSO) algorithm such as Adaptive Particle Swarm Optimization (APSO) and Particle Swarm Optimization with an Aging Leader and Challengers (ALC-PSO). These algorithms improve the performance of PSO in terms of finding the best solution and accelerating the convergence speed. However, these algorithms are computationally intensive. The go...
A Multi-Scale Electromagnetic Particle Code with Adaptive Mesh Refinement and Its Parallelization
To investigate multi-scale phenomena in space plasma including plasma kinetic effects, we started to develop a new electromagnetic Particle-In-Cell (PIC) code with Adaptive Mesh Refinement (AMR) technique. In AMR simulation, spatial grid size and time step intervals are defined according to the hierarchy levels, where high and low levels correspond to the fine and coarse grid systems, respectiv...
Journal title:
- Computer Physics Communications
Volume 174, Issue
Pages -
Publication date: 2006